Description

Cross-Reference to Related Application 

[0001] The subject matter of this application is related
to that of the U.S. Patent Application of J. Herre, entitled 
"Perceptual Noise Shaping in the Time Domain via LPC 
Prediction in the Frequency Domain," Ser. No. 
08/585086, filed on January 16, 1996 and assigned to 
the assignee of the present invention and corresponding 
to EP-A-0 785 631 published on 23.07.97. "Perceptual 
Noise Shaping in the Time Domain via LPC Prediction 
in the Frequency Domain" is hereby incorporated by ref- 
erence as if fully set forth herein. 

Field of the Invention 

[0002] The present invention relates to the field of au- 
dio signal coding and more specifically to an improved 
method and apparatus for performing joint stereo coding 
of multi-channel audio signals. 

Background of the Invention 

[0003] During the last several years so-called "per- 
ceptual audio coders" have been developed enabling 
the transmission and storage of high quality audio signals
at bit rates of about 1/12 or less of the bit rate commonly
used on a conventional Compact Disc medium
(CD). Such coders exploit the irrelevancy contained in 
an audio signal due to the limitations of the human au- 
ditory system by coding the signal with only so much 
accuracy as is necessary to result in a perceptually in- 
distinguishable reconstructed (i.e., decoded) signal. 
Standards have been established under various standards
organizations, for example the MPEG1 and MPEG2 audio
standards of the International Standardization
Organization's Moving Picture Experts Group (ISO/MPEG).
Perceptual audio coders are described in detail, for ex- 
ample, in U.S. Patent No. 5,285,498 issued to James 
D. Johnston on Feb. 8, 1994 and in U.S. Patent No. 
5,341,457 issued to Joseph L. Hall II and James D. 
Johnston on Aug. 23, 1994, each of which is assigned 
to the assignee of the present invention. 
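
As a rough worked example of the "about 1/12" figure, the short calculation below assumes the standard Compact Disc audio format (44.1 kHz sampling, 16 bits per sample, two channels). These parameters are not stated in the text above and are used only for illustration.

```python
# CD audio parameters (assumed for illustration; not given in the text above).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# Uncompressed bit rate of CD audio, in bits per second.
cd_bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

# "About 1/12" of that rate, as cited for perceptual audio coders.
coded_bit_rate = cd_bit_rate / 12

print(f"CD bit rate:     {cd_bit_rate / 1000:.1f} kbit/s")
print(f"1/12 of CD rate: {coded_bit_rate / 1000:.1f} kbit/s")
```
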
[0004] Generally, the structure of a perceptual audio 
coder for monophonic audio signals can be described 
as follows: 

The input samples are converted into a subsampled 
spectral representation using various types of filter- 
banks and transforms such as, for example, the 
well-known modified discrete cosine transform 
(MDCT), polyphase filterbanks or hybrid structures. 

Using a perceptual model, one or more time-dependent
masking thresholds for the signal are estimated. These
thresholds give the maximum coding error that can be
introduced into the audio signal while still maintaining
perceptually unimpaired signal quality. In particular,
these masking thresholds may be individually determined on
a sub-band by sub-band basis. That is, each coder frequency
band, which comprises a grouping of one or more spectral
coefficients, will be advantageously coded together based
on a correspondingly determined masking threshold.

The spectral values are quantized and coded (on a coder
frequency band basis) according to the precision
corresponding to the masking threshold estimates. In this
way, the quantization noise may be hidden (i.e., masked) by
the respective transmitted signal and is thereby not
perceptible after decoding.

Finally, all relevant information (e.g., coded spectral 
values and additional side information) is packed in- 
to a bitstream and transmitted to the decoder. 

Accordingly, the processing used in a corresponding de- 
coder is reversed: 

The bitstream is decoded and parsed into coded
spectral data and side information.

The inverse quantization of the quantized spectral
values is performed (on a frequency band basis
corresponding to that used in the encoder).

The spectral values are mapped back into a time 
domain representation using a synthesis filterbank. 
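
The quantization and inverse quantization steps listed above can be sketched as follows. This is a minimal illustration, not the coder described in the cited patents: it assumes a uniform quantizer whose noise power is approximately step**2 / 12, and the function names, band layout and threshold values are hypothetical.

```python
import math

def quantize_band(coeffs, allowed_noise_power):
    """Quantize one coder frequency band with a uniform quantizer.

    Assumption (not from the text): the noise power of a uniform quantizer
    with step size `step` is about step**2 / 12, so the step is chosen to
    make that power equal to the band's masking threshold.
    """
    step = math.sqrt(12.0 * allowed_noise_power)
    indices = [round(c / step) for c in coeffs]  # integers sent to the decoder
    return indices, step

def dequantize_band(indices, step):
    """Inverse quantization, as performed by the decoder for the same band."""
    return [q * step for q in indices]

# Hypothetical block of spectral values and band layout:
# (start index, end index, allowed noise power per coefficient).
spectrum = [0.90, -0.42, 0.31, 0.05, -0.07, 0.02]
bands = [(0, 3, 1e-4), (3, 6, 4e-3)]  # the second band tolerates more noise

decoded = []
for start, end, threshold in bands:
    indices, step = quantize_band(spectrum[start:end], threshold)
    decoded.extend(dequantize_band(indices, step))

errors = [abs(a - b) for a, b in zip(spectrum, decoded)]
print(errors)
```

A band with a higher masking threshold receives a coarser step, so fewer distinct quantizer indices (and hence fewer bits) are needed for it.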

[0005] Using such a generic coder structure it is possible
to efficiently exploit the irrelevancy contained in each
signal due to the limitations of the human auditory system.
Specifically, the spectrum of the quantization noise can be
shaped according to the shape of the signal's noise masking
threshold. In this way, the noise which results from the
coding process can be "hidden" under the coded signal and,
thus, perceptually transparent quality can be achieved at
high compression rates.
[0006] Perceptual coding techniques for monophonic signals
have been successfully extended to the coding of
two-channel or multichannel stereophonic signals. In
particular, so-called "joint stereo" coding techniques have
been introduced which perform joint signal processing on
the input signals, rather than performing separate (i.e.,
independent) coding processes for each input signal. (Note
that as used herein, as used generally, and as is well
known to those of ordinary skill in the art, the words
"stereo" and "stereophonic" refer to the use of two or more
individual audio channels.)
[0007] There are at least two advantages to the use of
joint stereo coding techniques. First, the use of joint
stereo coding methods provides the ability to account for
binaural psychoacoustic effects. Second, the required bit
rate for the coding of stereophonic signals may be reduced
significantly below the bit rate required to perform
separate and independent encodings for each channel.

[0008] Generally, the structure of a multi-channel ster- 
eophonic perceptual audio coder can be described as 
follows: 

The samples of each input signal are converted into 
a subsampled spectral representation using vari- 
ous types of filterbanks and transforms, such as, for 
example, the modified discrete cosine transform 
(MDCT), polyphase filterbanks or hybrid structures. 

Using a model, the time-dependent masking 
threshold of the signal is estimated for each chan- 
nel. This gives the maximum coding error that can 
be introduced into the audio signal while still main- 
taining perceptually unimpaired signal quality. 

To perform joint stereo coding, portions of the spec- 
tral coefficient data are jointly processed to achieve 
a more efficient representation of the stereo signal. 
Depending on the joint stereo coding method em- 
ployed, adjustments may be made to the masking 
thresholds as well. 

The spectral values are quantized and coded ac- 
cording to the precision corresponding to the mask- 
ing threshold estimate(s). In this way, the quantiza- 
tion noise is hidden (i.e., masked) by the respective 
transmitted signal and is thereby not perceptible af- 
ter decoding. 

Finally, all relevant information (i.e., the coded 
spectral values and additional side information) is 
packed into a bitstream and transmitted to the de- 
coder. 

Accordingly, the processing used in the encoder is re- 
versed in the decoder: 

The bitstream is decoded and parsed into coded 
spectral data and side information. 

The inverse quantization of the quantized spectral 
values is carried out. 

The decoding process for the joint stereo process- 
ing is performed on the spectral values, thereby re- 
sulting in separate signals for each channel. 

The spectral values for each channel are mapped 
back into time domain representations using corre- 
sponding synthesis filterbanks. 

[0009] Currently, the two most commonly used joint stereo
coding techniques are known as "Mid/Side" (M/S) stereo
coding and "intensity" stereo coding. The structure and
operation of a coder based on M/S stereo coding is
described, for example, in U.S. Patent No. 5,285,498 (see
above). Using this technique, binaural masking effects can
be advantageously accounted for and, in addition, a certain
amount of signal-dependent gain may be achieved.

[0010] The intensity stereo method, however, provides a
higher potential for bit saving. In particular, this method
exploits the limitations of the human auditory system at
high frequencies (e.g., frequencies above 4 kHz), by
transmitting only one set of spectral coefficients for all
jointly coded channel signals, thereby achieving a
significant savings in data rate. Coders based on the
intensity stereo principle have been described in numerous
references including European Patent Application
0 497 413 A by R. Veldhuis et al., filed on January 24,
1992 and published on August 5, 1992, and (using different
terminology) PCT patent application WO 92/12607 by
M. Davis et al., filed on January 8, 1992 and published on
July 23, 1992.

[0011] By applying joint stereo processing to the spectral
coefficients prior to quantization, additional savings in
terms of the required bit rate can be achieved. For the
case of intensity stereo coding, some of these savings
derive from the fact that the human auditory system is
known to be insensitive to phase information at high
frequencies (e.g., frequencies above 4 kHz). Due to the
characteristics of human hair cells, signal envelopes are
perceptually evaluated rather than the signal waveform
itself. Thus, it is sufficient to code the envelope of
these portions of a signal, rather than having to code its
entire waveform. This may, for example, be accomplished by
transmitting one common set of spectral coefficients
(referred to herein as the "carrier signal") for all
participating channels, rather than transmitting separate
sets of coefficients for each channel. Then, in the
decoder, the carrier signal is scaled independently for
each signal channel to match its average envelope (or
signal energy) for the respective coder block.

[0012] The following processing steps are typically
performed for intensity stereo encoding/decoding on a
coder frequency band basis:

From the spectral coefficients of all participating
channels, one "carrier" signal is generated that is
suited to represent the individual channel signals.
This is usually done by forming linear combinations
of the partial signals.

Scaling information is extracted from the original
signals describing the envelope or energy content
in the particular coder frequency band.

Both the carrier signal and the scaling information
are transmitted to the decoder.

In the decoder, the spectral coefficients of the
carrier signal are reconstructed. The spectral
coefficients for each channel are then calculated by
scaling the carrier signal using the respective scaling
information for each channel.

[0013] As a result of this approach, only one set of 
spectral coefficients (i.e., the coefficients of the carrier 
signal) needs to be transmitted, together with a small 
amount of side information (i.e., the scaling information), 
instead of having to transmit a separate set of spectral 
components for each channel signal. For the two-chan- 
nel stereo case, this results in a saving of almost 50% 
of the data rate for the intensity coded frequency re- 
gions. 
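
The intensity stereo steps described above can be sketched for the two-channel case as follows. This is a minimal illustration under stated assumptions: the carrier is formed as the plain average of the two channels (one possible linear combination), the scaling information is the per-band energy of each channel, and all function and variable names are hypothetical.

```python
import math

def band_energy(values):
    """Energy content of one coder frequency band."""
    return sum(v * v for v in values)

def intensity_encode(left, right, band_edges):
    """Return one carrier signal plus per-band energies for each channel."""
    # Assumed linear combination: the average of the two channels.
    carrier = [0.5 * (l + r) for l, r in zip(left, right)]
    scaling = [(band_energy(left[s:e]), band_energy(right[s:e]))
               for s, e in zip(band_edges[:-1], band_edges[1:])]
    return carrier, scaling

def intensity_decode(carrier, scaling, band_edges):
    """Rebuild both channels by scaling the carrier band by band."""
    left, right = [], []
    bands = zip(band_edges[:-1], band_edges[1:])
    for (s, e), (e_left, e_right) in zip(bands, scaling):
        band = carrier[s:e]
        e_carrier = band_energy(band)
        # Gains that give each decoded band the transmitted energy.
        g_left = math.sqrt(e_left / e_carrier) if e_carrier > 0.0 else 0.0
        g_right = math.sqrt(e_right / e_carrier) if e_carrier > 0.0 else 0.0
        left.extend(g_left * v for v in band)
        right.extend(g_right * v for v in band)
    return left, right

# Hypothetical spectral coefficients for one block, split into two bands.
left = [0.8, -0.3, 0.5, 0.1, 0.2, -0.4, 0.05, 0.3]
right = [0.2, 0.1, -0.1, 0.3, -0.6, 0.2, 0.4, -0.1]
band_edges = [0, 4, 8]

carrier, scaling = intensity_encode(left, right, band_edges)
dec_left, dec_right = intensity_decode(carrier, scaling, band_edges)
```

In this sketch each decoded band has the same energy as the corresponding original band, but within a band both decoded channels are scaled copies of the single carrier.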

[0014] Despite the advantages of this approach, however,
excessive or uncontrolled application of the intensity
stereo coding technique can lead to deterioration in the
perceived stereo image, because the detailed structure of
the signals over time is not preserved for time periods
smaller than the granularity of the coding scheme (e.g.,
20 ms per block). In particular, as a consequence of the
use of a single carrier, all output signals which are
reconstructed therefrom are necessarily scaled versions of
each other. In other words, they have the same fine
envelope structure for the duration of the coded block
(e.g., 10-20 ms). This does not present a significant
problem for stationary signals or for signals having
similar fine envelope structures in the intensity stereo
coded channels.

[0015] For transient signals with dissimilar envelopes
in different channels, however, the original distribution 
of the envelope onsets between the coded channels 
cannot be recovered. For example, in a stereophonic 
recording of an applauding audience, the individual
envelopes will be very different in the right and left
channels due to the distinct clapping events happening at
different times in the two channels. Similar effects will
occur for recordings made using stereophonic microphones, such
that the spatial location of a sound source is, in essence, 
encoded as time differences or delays between the re- 
spective channel signals. Consequently, the stereo im- 
age quality of an intensity stereo coded/decoded signal 
will decrease significantly in these cases. The spatial im- 
pression tends to narrow, and the perceived stereo im- 
age tends to collapse into the center position. For critical 
signals, the achieved quality can no longer be consid- 
ered acceptable. 

[0016] Several strategies have been proposed in or- 
der to avoid deterioration in the stereo image of an in- 
tensity stereo encoded/decoded signal. Since using in- 
tensity stereo coding involves the risk of affecting the 
stereo image, it has been proposed to use the technique 
only in cases when the coder runs out of bits, so that 
severe quantization distortions, which would be per- 
ceived by the listener as being even more annoying, can 
be avoided. Alternatively, an algorithm can be employed 
which detects dissimilarities in the fine temporal
structures of the channels. If a mismatch in envelopes is
detected, intensity stereo coding is not applied in the
given block. Such an approach is described, for example, in
"Intensity Stereo Coding" by J. Herre et al., 96th Audio
Engineering Society Convention, Amsterdam, February
1994. However, it is an obvious drawback of the prior
proposed solutions that the potential for bit savings can
no longer be fully exploited, given that the intensity
stereo coding is disabled for such signals.

Summary of the Invention 

[0017] In accordance with an illustrative embodiment of the
present invention, the drawbacks of prior art techniques
are overcome by a method and apparatus for performing joint
stereo coding of multi-channel audio signals using
intensity stereo coding techniques. In particular,
predictive filtering techniques are applied to the spectral
coefficient data, thereby preserving the fine time
structure of the output signal of each channel, while
maintaining the benefit of the high bit rate savings
offered by intensity stereo coding. In one illustrative
embodiment of the present invention, a method for enhancing
the perceived stereo image of intensity stereo
encoded/decoded signals is provided by applying the
following processing steps in an encoder for two-channel
stereophonic signals:

The input signal of each channel is decomposed in- 
to spectral coefficients by a high-resolution filter- 
bank/transform. 

Using a model, one or more time-dependent masking
thresholds are estimated for each channel. This
advantageously gives the maximum coding error that can be
introduced into the audio signal while still maintaining
perceptually unimpaired signal quality.

For each channel, a filter performing linear prediction
in frequency is applied at the filterbank outputs, such
that the residual, rather than the actual filterbank
output signal, is used for the steps which follow.

Intensity stereo coding techniques are applied for
coding both residual signals into one carrier signal.

The spectral values of the carrier signal are quan- 
tized and coded according to the precision corre- 
sponding to the masking threshold estimate(s). 

All relevant information (i.e., the coded spectral
values, intensity scaling data and prediction filter data
for each channel, as well as additional side information)
is packed into a bitstream and transmitted to the decoder.

[0018] Similarly, a decoder for joint stereo encoded
signals, corresponding to the above-described illustrative
encoder and in accordance with another illustrative
embodiment of the present invention, carries out the
following processing steps:

The bitstream is decoded and parsed into coded 
spectral data and side information. 

An inverse quantization of the quantized spectral 
values for the carrier signal is performed. 

Intensity stereo decoding is performed on the spec- 
tral values of the carrier signal, thereby producing 
(residual) signals for each channel. 

For each channel, inverse prediction filters, operat- 
ing in frequency and corresponding to the prediction 
filters applied by the encoder used to encode the 
original signal, are applied to the residual signals. 

The values produced by the inverse prediction fil- 
ters are mapped back into time domain representa- 
tions using synthesis filterbanks. 

Brief Description of the Drawings 

[0019] 

Fig. 1 shows a prior art encoder for two-channel 
stereophonic signals in which conventional intensi- 
ty stereo coding techniques are employed. 

Fig. 2 shows an encoder for two-channel stereo- 
phonic signals in accordance with an illustrative em- 
bodiment of the present invention. 

Fig. 3 shows an illustrative implementation of the 
predictive filters of the illustrative encoder of Fig. 2. 

Fig. 4 shows a prior art decoder for joint stereo coded
signals, corresponding to the prior art encoder of
Fig. 1, in which conventional intensity stereo coding
techniques are employed.

Fig. 5 shows a decoder for joint stereo coded sig- 
nals, corresponding to the illustrative encoder of 
Fig. 2, in accordance with an illustrative embodi- 
ment of the present invention. 

Fig. 6 shows an illustrative implementation of the 
inverse predictive filters of the illustrative decoder 
of Fig. 5. 

Fig. 7 shows a flow chart of a method of encoding 
two-channel stereophonic signals in accordance 
with an illustrative embodiment of the present in- 
vention. 

Fig. 8 shows a flow chart of a method of decoding
joint stereo coded signals, corresponding to the
illustrative encoding method shown in Fig. 7, in
accordance with an illustrative embodiment of the
present invention.

Detailed Description 
Overview 

[0020] The incorporation of a predictive filtering process
into the encoder and decoder in accordance with certain
illustrative embodiments of the present invention
advantageously enhances the quality of the intensity stereo
encoded/decoded signal by overcoming the limitation of
prior art schemes whereby identical fine envelope
structures are produced in all intensity stereo decoded
channel signals. In particular, the illustrative encoding
method overcomes the drawbacks of prior techniques by
effectively extending the filterbank with the predictive
filtering stage, such that the envelope information common
over frequency is extracted as filter coefficients, and is,
for the most part, stripped from the residual signal.

[0021] Specifically, for each input channel signal, a
linear prediction is carried out on its corresponding
spectral coefficient data, wherein the linear prediction is
performed over frequency. Since predictive coding is
applied to spectral domain data, the relations known for
classical predictions are valid with the time and frequency
domains interchanged. For example, the prediction error
signal ideally has a "flat" (square of the) envelope, as
opposed to having a "flat" power spectrum (a
"pre-whitening" filter effect). The fine temporal structure
information for each channel signal is contained in its
prediction filter coefficients. Thus, it can be assumed
that the carrier signal used for intensity stereo coding
will also have a flat envelope, since it is generated by
forming linear combinations of the (filtered) channel signals.
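
The interchange of the time and frequency domains can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: an orthonormal DCT stands in for the coder's filterbank, the block size is 64, the input is a single click, and a second-order predictor is fitted by the autocorrelation method. None of these choices is taken from the text.

```python
import math

N = 64  # block size (assumed for illustration)

def dct(x):
    """Orthonormal DCT-II, a simple stand-in for the coder's filterbank."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                  for t in range(n))
            for k in range(n)]

def idct(y):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(y)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * y[k]
                * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                for k in range(n))
            for t in range(n)]

def crest(x):
    """Peak-to-mean ratio of the squared signal; large for a peaky envelope."""
    power = [v * v for v in x]
    return max(power) / (sum(power) / len(power))

# A single click at time index 40: the most peaky temporal envelope possible.
x = [0.0] * N
x[40] = 1.0
y = dct(x)  # over frequency the click resembles a sinusoid, so it is predictable

# Second-order linear predictor across frequency (autocorrelation method).
r0, r1, r2 = [sum(y[k] * y[k - lag] for k in range(lag, N)) for lag in range(3)]
det = r0 * r0 - r1 * r1
a1 = r1 * (r0 - r2) / det
a2 = (r0 * r2 - r1 * r1) / det

# Prediction residual over frequency; missing neighbours are treated as zero.
residual = [y[k]
            - (a1 * y[k - 1] if k >= 1 else 0.0)
            - (a2 * y[k - 2] if k >= 2 else 0.0)
            for k in range(N)]

print(f"crest of original block: {crest(x):.1f}")
print(f"crest of residual block: {crest(idct(residual)):.1f}")
```

The residual, mapped back to the time domain, has a lower peak-to-mean ratio than the click itself, while the information about where the click occurred has moved into the predictor coefficients a1 and a2.
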
[0022] In a corresponding decoder in accordance with an
illustrative embodiment of the present invention, each
channel signal is re-scaled according to the transmitted
scaling information, and the inverse filtering process is
applied to the spectral coefficients. In this way, the
inverse "pre-whitening" process is performed on the
envelope of each decoded channel signal, effectively
re-introducing the envelope information into the spectral
coefficients. Since this is done individually for each
channel, the extended encoding/decoding system is capable
of reproducing different individual fine envelope
structures for each channel signal. Note that, in effect,
using a combination of filterbank and linear prediction in
frequency is equivalent to using an adaptive filterbank
matched to the envelope of the input signal. Since the
process of envelope shaping a signal can be performed
either for the entire spectrum of the signal or for only
part thereof, this time-domain envelope control can be
advantageously applied in any necessary
frequency-dependent fashion.
[0023] And in accordance with another embodiment of the
present invention, the bitstream which is, for example,
generated by the illustrative encoder described above (and
described in further detail below with reference to Figs.
2, 3 and 7) may be advantageously stored on a storage
medium such as a Compact Disc or a Digital Audio Tape, or
stored in a semiconductor memory device. Such a storage
medium may then be "read back" to supply the bitstream for
subsequent decoding by, for example, the illustrative
decoder described above (and described in further detail
below with reference to Figs. 5, 6 and 8). In this manner,
a substantial quantity of audio data (e.g., music) may be
compressed onto the given storage medium without loss of
(perceptual) quality in the reconstructed signal.

A prior art encoder 

[0024] Fig. 1 shows a prior art perceptual encoder for 
two-channel stereophonic signals in which conventional 
intensity stereo coding techniques are employed. The 
encoder of Fig. 1 operates as follows: 

The left and right input signals, xl(k) and xr(k), are
each individually decomposed into spectral coefficients
by analysis filterbank/transform modules 12l and 12r,
respectively, resulting in corresponding sets of "n"
spectral components, yl(b,0...n-1) and yr(b,0...n-1),
respectively, for each analysis block b, where "n" is the
number of spectral coefficients per analysis block (i.e.,
the block size). Each spectral component yl(b,i) or
yr(b,i) is associated with an analysis frequency in
accordance with the particular filterbank employed.

For each channel, perceptual model 11l or 11r estimates
the required coding precision for perceptually transparent
quality of the encoded/decoded signal. This estimation
data may, for example, be based on the minimum
signal-to-noise ratio (SNR) required in each coder band,
and is passed to the quantization/encoding module.

The spectral values for both the left and the right
channel, yl(b,0...n-1) and yr(b,0...n-1), are provided to
intensity stereo encoding module 13, which performs
conventional intensity stereo encoding. For portions of
the spectrum which are to be excluded from intensity
stereo coding, the corresponding values of yl(b,0...n-1)
and yr(b,0...n-1) may be passed directly to the
quantization and coding stage. For portions of the
spectrum which are to make use of intensity stereo coding
(i.e., preferably the high-frequency portions thereof),
the intensity stereo coding process is performed as
follows. From each of the signals yl() and yr(), scaling
information is extracted for each coder frequency band
(e.g., peak amplitude or total energy), and a single
carrier signal yi() is generated by combining the
corresponding yl() and yr() values. Thus, for spectral
portions coded in intensity stereo, only one set of values
yi() for both channels, plus scaling side information for
each channel, is provided to the quantization and coding
stage. Alternatively, combined scaling information for
both channels together with directional information can
be used (along with the single carrier signal).

The spectral components at the output of the intensity
stereo encoding stage, consisting of separate values yl()
and yr() and common values yi(), are quantized and mapped
to transmission symbols by quantization and encoding
module 14. This module takes into account the required
coding precision as determined by perceptual models 11l
and 11r.

The transmission symbol values generated by quantization
and encoding module 14, together with further side
information, are passed to bitstream encoder/multiplexer
15 and are thereby transmitted in the encoded bitstream.
For coder frequency bands which use intensity stereo
coding, the scaling information delivered by intensity
stereo encoding module 13 is also provided to bitstream
encoder/multiplexer 15 and thereby transmitted in the
encoded bitstream as well.
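
The partition of one block into separately coded values and a common carrier, as performed around intensity stereo encoding module 13, might be sketched as follows. The mapping from coefficient index to analysis frequency assumes n uniformly spaced coefficients between 0 Hz and half the sampling rate; the 48 kHz sampling rate, the 4 kHz cutoff, the averaging used to form the carrier and all names are assumptions for illustration. Scaling information is omitted here for brevity.

```python
def coefficient_frequency_hz(i, n, sample_rate_hz):
    """Centre frequency of coefficient i, assuming n uniformly spaced
    coefficients covering 0 Hz up to half the sampling rate."""
    return (i + 0.5) * sample_rate_hz / (2.0 * n)

def split_for_intensity(yl, yr, sample_rate_hz, cutoff_hz=4000.0):
    """Split one block into separately coded low-frequency values and a
    single high-frequency carrier yi()."""
    n = len(yl)
    # Index of the first coefficient at or above the cutoff frequency.
    first = next((i for i in range(n)
                  if coefficient_frequency_hz(i, n, sample_rate_hz) >= cutoff_hz),
                 n)
    separate = (yl[:first], yr[:first])      # passed on unchanged
    carrier = [0.5 * (l + r)                 # one possible linear combination
               for l, r in zip(yl[first:], yr[first:])]
    return first, separate, carrier

# Hypothetical block of n = 1024 coefficients per channel.
yl = [1.0] * 1024
yr = [0.5] * 1024
first, separate, carrier = split_for_intensity(yl, yr, sample_rate_hz=48000.0)

values_without = 2 * len(yl)
values_with = 2 * first + len(carrier)
print(first, values_without, values_with)
```
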



[0025] Fig. 2 shows an encoder for two-channel
stereophonic signals in accordance with an illustrative
embodiment of the present invention. The operation of the
illustrative encoder of Fig. 2 is similar to that of the
prior art encoder shown in Fig. 1, except that, for each
channel, a predictive filtering stage is introduced between
the corresponding analysis filterbank and the intensity
stereo encoding module. That is, predictive filters 16l and
16r are applied to the outputs of analysis filterbanks 12l
and 12r, respectively. As such, the spectral values,
yl(b,0...n-1) and yr(b,0...n-1), are replaced by the output
values of the predictive filtering process, yl'(b,0...n-1)
and yr'(b,0...n-1), respectively, before being provided to
intensity stereo encoding module 13.
[0026] Fig. 3 shows an illustrative implementation of 
the predictive filters of the illustrative encoder of Fig. 2. 
Specifically, inside the predictive filtering stage for each 
channel, a linear prediction is performed across fre- 
quency (as opposed, for example, to predictive coding 
which is performed across time, such as is employed by 
subband-ADPCM coders). To this end, "rotating switch" 43
operates to bring spectral values y(b,0...n-1) into a
serial order prior to processing, and the resulting output
values y'(b,0...n-1) are provided in parallel thereafter by
"rotating switch" 46. (Note that the use of "rotating
switches" as a mechanism for conversion between serial and
parallel orderings is used herein only for the purpose of
convenience and ease of understanding. As will
be obvious to those of ordinary skill in the art, no such 
physical switching device need be provided. Rather, 
conversions between serial and parallel orderings may 
be performed in any of a number of conventional ways 
familiar to those skilled in the art, including by the use 
of software alone.) Although the illustrative embodiment 
shown herein performs the processing of the spectral 
values in order of increasing frequency, alternative em- 
bodiments may, for example, perform the processing 
thereof in order of decreasing frequency. Other order- 
ings are also possible, as would be clear to one of ordi- 
nary skill in the art. 

[0027] Specifically, as can be seen from the figure, the 
resultant output values, y'(b,0...n-1), are computed from
the input values, y(b,0...n-1), by subtracting (with use of
subtractor 48) the predicted value (predicted by predic- 
tor 47) from the input values, so that only the prediction 
error signal is passed on. Note that the combination of 
predictor 47 and subtractor 48, labelled in the figure as 
envelope pre-whitening filter 44, functions to equalize 
the temporal shape of the corresponding time signal. 
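
The combination of predictor and subtractor can be sketched as a prediction error filter running over the spectral values in order of increasing frequency. The sketch below is a minimal illustration with hypothetical names and coefficient values. It also includes the matching inverse filter, which adds the prediction back, to show that the operation is reversible when the same coefficients are used.

```python
def prediction_error_filter(y, a):
    """Subtract the predicted value from each input value.

    y: spectral values of one block, in order of increasing frequency.
    a: predictor coefficients (hypothetical name); the prediction of y[i]
       is sum(a[j] * y[i - 1 - j]) over the values already processed.
    """
    out = []
    for i in range(len(y)):
        predicted = sum(a[j] * y[i - 1 - j]
                        for j in range(len(a)) if i - 1 - j >= 0)
        out.append(y[i] - predicted)  # only the prediction error is passed on
    return out

def inverse_prediction_filter(residual, a):
    """Undo prediction_error_filter by adding the prediction back."""
    y = []
    for i in range(len(residual)):
        predicted = sum(a[j] * y[i - 1 - j]
                        for j in range(len(a)) if i - 1 - j >= 0)
        y.append(residual[i] + predicted)
    return y

# Hypothetical spectral values and second-order predictor.
y = [1.0, 0.9, 0.7, 0.4, 0.0, -0.4]
a = [1.2, -0.5]

residual = prediction_error_filter(y, a)
restored = inverse_prediction_filter(residual, a)
print(residual)
print(restored)
```
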
[0028] The process performed by predictive filters 
16l and 16r of the illustrative encoder of Fig. 2 can be
performed either for the entire spectrum (i.e., for all 
spectral coefficients), or, alternatively, for only a portion 
of the spectrum (i.e., a subset of the spectral coeffi- 
cients). Moreover, different predictor filters (e.g., differ- 
ent predictors 47 as shown in Fig. 3) can be used for 
different portions of the signal spectrum. In this manner, 
the above-described method for time-domain envelope 
control can be applied in any necessary frequency-de- 
pendent fashion. 

[0029] In order to enable the proper decoding of the
signal, the bitstream advantageously includes certain
additional side information. For example, one field of such
information might indicate the use of predictive filtering
and, if applicable, the number of different prediction
filters. If predictive filtering is used, additional fields
in the bitstream may be transmitted for each prediction
filter indicating the target frequency range of the
respective filter and its filter coefficients. Thus, as
shown in Fig. 2 by the dashed lines labelled "L Filter
Data" and "R Filter Data," predictive filters 16l and 16r
provide the necessary information to bitstream
encoder/multiplexer 17 for inclusion in the transmitted bitstream.
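
One way to picture this side information is as a small record per channel. The layout below is purely hypothetical: the text names the kinds of fields (use of predictive filtering, number of filters, target frequency range, filter coefficients) but does not define a bitstream syntax, so every class and field name here is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PredictionFilterData:
    """Side information for one prediction filter (hypothetical layout)."""
    range_start_hz: float       # lower edge of the target frequency range
    range_end_hz: float         # upper edge of the target frequency range
    coefficients: List[float]   # quantized filter coefficients

@dataclass
class ChannelFilterData:
    """Per-channel side information ("L Filter Data" or "R Filter Data")."""
    prediction_used: bool = False
    filters: List[PredictionFilterData] = field(default_factory=list)

    def __post_init__(self):
        # Filter fields are only meaningful when predictive filtering is on.
        if not self.prediction_used and self.filters:
            raise ValueError("filters present although prediction is off")

    @property
    def number_of_filters(self) -> int:
        return len(self.filters)

# Left channel: one filter covering a hypothetical 4 kHz to 20 kHz range.
left_data = ChannelFilterData(
    prediction_used=True,
    filters=[PredictionFilterData(4000.0, 20000.0, [0.75, -0.25])],
)
# Right channel: no predictive filtering in this block.
right_data = ChannelFilterData()
```
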
[0030] Fig. 7 shows a flow chart of a method of en- 
coding two-channel stereophonic signals in accordance 
with an illustrative embodiment of the present invention. 
The illustrative example shown in this flow chart imple- 
ments certain relevant portions of the illustrative encod- 
er of Fig. 2. Specifically, the flow chart shows the front- 
end portion of the encoder for a single one of the chan- 
nels, including the envelope pre-whitening process us- 
ing a single prediction filter. This pre-whitening process 
is carried out after the calculation of the spectral values 
by the analysis filterbank, as shown in step 61 of the 
figure. 



[0031] Specifically, after the analysis filterbank is run,
the order of the prediction filter is set and the target
frequency range is defined (step 62). These parameters may
illustratively be set to a filter order of 15 and a target
frequency range comprising the entire frequency range that
will be coded using intensity stereo coding (e.g., from
4 kHz to 20 kHz). In this manner, the scheme is
advantageously configured to provide one set of individual
fine temporal structure data for each audio channel. In
step 63, the prediction filter is determined by using the
range of spectral coefficients matching the target
frequency range, and by applying a conventional method for
predictive coding as is well known, for example, in the
context of Differential Pulse Code Modulation (DPCM)
coders. For example, the autocorrelation function of the
coefficients may be calculated and used in a conventional
Levinson-Durbin recursion algorithm, well known to those
skilled in the art. As a result, the predictor filter
coefficients, the corresponding reflection coefficients
("PARCOR" coefficients), and the expected prediction gain
are known.
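
The computation in step 63 can be sketched with a textbook Levinson-Durbin recursion. This is a minimal sketch rather than the implementation of the illustrative encoder: the function names and the input data are hypothetical, the example uses order 2 instead of 15 for brevity, and the sign convention of the reflection coefficients is one of several in common use. It assumes the input is not all zero and is not perfectly predictable, so that the residual error stays positive.

```python
import math

def autocorrelation(values, max_lag):
    """Autocorrelation r[0..max_lag] of the spectral coefficients."""
    return [sum(values[i] * values[i - lag] for i in range(lag, len(values)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (predictor coefficients, reflection coefficients, gain in dB).

    The prediction of y[i] is sum(a[j] * y[i - 1 - j]).
    """
    a = []
    reflection = []
    error = r[0]  # prediction error energy of the order-0 predictor
    for m in range(order):
        acc = r[m + 1] - sum(a[j] * r[m - j] for j in range(m))
        k = acc / error
        # Update the lower-order coefficients, then append the new one.
        a = [a[j] - k * a[m - 1 - j] for j in range(m)] + [k]
        reflection.append(k)
        error *= (1.0 - k * k)
    gain_db = 10.0 * math.log10(r[0] / error)  # expected prediction gain
    return a, reflection, gain_db

# Hypothetical spectral coefficients in the target frequency range.
coeffs = [math.cos(0.4 * i) * 0.98 ** i for i in range(64)]
r = autocorrelation(coeffs, 2)
a, reflection, gain_db = levinson_durbin(r, 2)
print(a, reflection, gain_db)
```
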

[0032] If the expected prediction gain exceeds a certain
threshold (e.g., 2 dB), as determined by decision
64, the predictive filtering procedure of steps 65 through
67 is used. In this case, the prediction filter coefficients
are quantized (in step 65) as required for transmission
to the decoder as part of the side information. Then, in
step 66, the prediction filter is applied to the range of
spectral coefficients matching the target frequency
range, using the quantized filter coefficients.
For all further processing, therefore, the spectral coefficients
are replaced by the output of the filtering process.
Finally, in step 67, a field of the bitstream to be transmitted
is set to indicate the use of predictive filtering ("prediction
flag" on). In addition, the target frequency range,
the order of the prediction filter, and information describing
its filter coefficients are also included in the bitstream.
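
Decision 64 and steps 65 through 67 can be sketched for one channel as follows. This is a minimal illustration under stated assumptions: the uniform quantizer and its step size, the dictionary standing in for the side information, and all function names are hypothetical, and the filter is assumed to be an open-loop prediction error filter running across the coefficients of the target range.

```python
def quantize(coeffs, step=1.0 / 64):
    """Uniform quantization of the filter coefficients (step size is assumed)."""
    return [round(c / step) * step for c in coeffs]


def prediction_error_filter(y, a):
    """Residual across frequency: e[n] = y[n] - sum(a[j] * y[n - 1 - j])."""
    e = []
    for n in range(len(y)):
        pred = sum(a[j] * y[n - 1 - j] for j in range(len(a)) if n - 1 - j >= 0)
        e.append(y[n] - pred)
    return e


def encode_band(spectrum, lo, hi, a, gain_db, threshold_db=2.0):
    """Return (spectrum_out, side_info) for the target range [lo, hi)."""
    if gain_db <= threshold_db:
        # Gain too small: leave the spectrum untouched, "prediction flag" off.
        return list(spectrum), {"prediction_flag": False}
    aq = quantize(a)                                  # step 65
    out = list(spectrum)
    out[lo:hi] = prediction_error_filter(spectrum[lo:hi], aq)   # step 66
    side_info = {                                     # step 67
        "prediction_flag": True,
        "range": (lo, hi),
        "order": len(aq),
        "coeffs": aq,
    }
    return out, side_info


# Illustrative data only.
spectrum = [0.9 ** n for n in range(16)]
filtered, side_info = encode_band(spectrum, 4, 16, [0.9], gain_db=7.2)
unfiltered, no_side_info = encode_band(spectrum, 4, 16, [0.9], gain_db=1.0)
```

Filtering with the quantized coefficients, rather than the unquantized ones, keeps the encoder consistent with what the decoder can reconstruct from the side information.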

[0033] If, on the other hand, the expected prediction
gain does not exceed the decision threshold as determined
by decision 64, step 68 sets a field in the bitstream
to indicate that no predictive filtering has been
used ("prediction flag" off). Finally, after the above-described
processing is complete, conventional steps as
performed in prior art encoders (such as those carried
out by the encoder of Fig. 1) are performed -- that is, the
intensity stereo encoding process is applied to the spectral
coefficients (which may now be residual data), the
results of the intensity stereo encoding process are
quantized and encoded, and the actual bitstream to be
transmitted is encoded for transmission (with the appropriate
side information multiplexed therein). Note, however,
that bitstream encoder/multiplexer 17 of the illustrative
encoder of Fig. 2 replaces conventional bitstream
encoder/multiplexer 15 of the prior art encoder of Fig. 1,
so that the additional side information provided by predictive
filters 16l and 16r (i.e., "L Filter Data" and "R
Filter Data") may be advantageously encoded and
transmitted in the resultant bitstream.

A prior art decoder 

[0034] Fig. 4 shows a prior art decoder for joint stereo 
coded signals, corresponding to the prior art encoder of 
Fig. 1, in which conventional intensity stereo coding
techniques are employed. Specifically, the decoder of 
Fig. 4 performs the following steps: 

The incoming bitstream is parsed by bitstream decoder/demultiplexer
21, and the transmission symbols
for the spectral coefficients are passed on to
decoding and inverse quantization module 22, together
with the quantization related side information.

In decoding and inverse quantization module 22, 
the quantized spectral values, yql(), yqr() and yqi(), 
are reconstructed. These signals correspond to the 
independently coded left channel signal portion, the 
independently coded right channel signal portion, 
and the intensity stereo carrier signal, respectively. 

From the reconstructed spectral values of the car- 
rier signal and the transmitted scaling information, 
the missing portions of the yql() and yqr() spectra 
for the left and right channel signals are calculated 
with use of a conventional intensity stereo decoding 
process, which is performed by intensity stereo de- 
coding module 23. At the output of this module, two 
complete (and independent) channel spectral sig- 
nals, yql() and yqr(), corresponding to the left and 
right channels, respectively, are available. 

Finally, each of the left and right channel spectral
signals, yql() and yqr(), is mapped back into a time
domain representation by synthesis filterbanks 24l
and 24r, respectively, thereby resulting in the final
output signals xl'(k) and xr'(k).
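
The intensity stereo decoding step in the list above can be sketched as follows. The text only says that the missing portions are calculated from the carrier signal and the transmitted scaling information; the per-band, per-channel scale factors, the band layout, and the function name used here are assumptions made for illustration.

```python
def intensity_stereo_decode(yql, yqr, yqi, band_edges, scale_l, scale_r):
    """Fill the intensity-coded part of both channel spectra from the carrier.

    band_edges lists the boundaries of the intensity-coded bands, so band b
    covers indices band_edges[b] .. band_edges[b + 1] - 1. Coefficients below
    band_edges[0] are the independently coded portions and are kept as-is.
    """
    left, right = list(yql), list(yqr)
    for b in range(len(band_edges) - 1):
        for n in range(band_edges[b], band_edges[b + 1]):
            left[n] = scale_l[b] * yqi[n]    # carrier scaled for the left channel
            right[n] = scale_r[b] * yqi[n]   # carrier scaled for the right channel
    return left, right


# Illustrative data only: indices 0-3 coded independently, 4-7 via the carrier.
yql = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
yqr = [5.0, 6.0, 7.0, 8.0, 0.0, 0.0, 0.0, 0.0]
yqi = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
left, right = intensity_stereo_decode(
    yql, yqr, yqi, band_edges=[4, 6, 8], scale_l=[0.5, 2.0], scale_r=[1.5, 0.25])
```

Because both channels are rebuilt from the same carrier with only a per-band scale difference, they share one fine temporal structure in the intensity-coded range.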

An illustrative decoder 

[0035] Fig. 5 shows a decoder for joint stereo coded
signals, corresponding to the illustrative encoder of Fig.
2, in accordance with an illustrative embodiment of the
present invention. The operation of the illustrative decoder
of Fig. 5 is similar to that of the prior art decoder
shown in Fig. 4, except that, for each channel, an inverse
predictive filtering stage is introduced between the
intensity stereo decoding and the corresponding synthesis
filterbanks. That is, inverse predictive filters 26l
and 26r are inserted prior to synthesis filterbanks 24l
and 24r, respectively. Thus, the spectral values, yql()
and yqr(), as generated by intensity stereo decoding
module 23, are replaced by the output values of the corresponding
inverse predictive filtering processes, yql'()
and yqr'(), respectively, before being provided to their
corresponding synthesis filterbanks (synthesis filterbanks
24l and 24r).

[0036] Fig. 6 shows an illustrative implementation of
the inverse predictive filters of the illustrative decoder of
Fig. 5. Specifically, within the inverse predictive filters,
a linear filtering operation is performed across frequency
(as opposed to performing predictive coding across time as in
subband-ADPCM coders). In a similar manner to that
shown in the prediction filter implementation of Fig. 3,
"rotating switch" 33 of Fig. 6 is used to bring the spectral
values yq(b,0 ... n-1) into a serial order prior to processing,
and "rotating switch" 36 of the figure is used to bring
the resulting output values yq'(b,0 ... n-1) into a parallel
order thereafter. (Once again, note that the use of "rotating
switches" as a mechanism for conversion between
serial and parallel orderings is provided herein
only for the purpose of convenience and ease of understanding.
As will be obvious to those of ordinary skill in
the art, no such physical switching device need be provided.
Rather, conversions between serial and parallel
orderings may be performed in any of a number of conventional
ways familiar to those skilled in the art, including
by the use of software alone.) Again, as in the case
of the illustrative encoder described above, processing
in order of increasing or decreasing frequency is possible,
as well as other possible orderings obvious to those
skilled in the art.

[0037] Specifically, as can be seen from the figure, the
output values, yq'(b,0 ... n-1), are computed from the input
values, yq(b,0 ... n-1), by applying the inverse of the
envelope pre-whitening filter used in the corresponding
encoder. In particular, the output values are computed
from the input values by adding (with use of adder 38)
the predicted values (predicted by predictor 37) to the
input values as shown. Note that the combination of predictor
37 and adder 38, labelled in the figure as envelope
shaping filter 34, functions to re-introduce the temporal
shape of the original time signal.
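
The relationship between the pre-whitening filter and the envelope shaping filter can be sketched as follows. This is a minimal illustration that assumes the decoder's predictor operates on previously reconstructed output values, which is what makes the filter the exact inverse of an encoder filter of the form e[n] = y[n] - sum(a[j] * y[n-1-j]); the function names and data are hypothetical.

```python
def prewhiten(y, a):
    """Encoder side (assumed form): subtract the prediction from each value."""
    return [y[n] - sum(a[j] * y[n - 1 - j] for j in range(len(a)) if n - 1 - j >= 0)
            for n in range(len(y))]


def envelope_shaping_filter(e, a):
    """Decoder side: add the predicted value to each input value.

    The prediction is formed from outputs that have already been
    reconstructed, so the filter is recursive across frequency.
    """
    out = []
    for n in range(len(e)):
        pred = sum(a[j] * out[n - 1 - j] for j in range(len(a)) if n - 1 - j >= 0)
        out.append(e[n] + pred)   # role of the adder fed by the predictor
    return out


# Illustrative data only.
original = [1.0, -0.5, 0.25, 2.0, 0.0, -1.0]
coeffs = [0.5, -0.25]
residual = prewhiten(original, coeffs)
restored = envelope_shaping_filter(residual, coeffs)
```

In the absence of quantization the round trip reproduces the original coefficients; in a real coder the residual is quantized in between, and the inverse filter shapes that quantization noise along with the signal.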

[0038] As described above in the discussion of the illustrative
encoder of Figs. 2 and 3, the above-described
filtering process can be performed either for the entire
spectrum (i.e., for all spectral coefficients), or for only a
portion of the spectrum (i.e., a subset of the spectral coefficients).
Moreover, different predictor filters (e.g., different
predictors 37 as shown in Fig. 6) can be used for
different parts of the signal spectrum. In such a case (in
order to execute the proper decoding of the signal), the
illustrative decoder of Fig. 5 advantageously decodes
from the bitstream the additional side information (labelled
in the figure as "L Filter Data" and "R Filter Data")
which had been transmitted by the encoder, and supplies
this data to inverse predictive filters 26l and 26r. In
this manner, predictive decoding can be applied in
each specified target frequency range with a corresponding
prediction filter.

[0039] Fig. 8 shows a flow chart of a method of de- 
coding joint stereo coded signals, corresponding to the 
illustrative encoding method shown in Fig. 7, in accordance
with an illustrative embodiment of the present invention.
The illustrative example shown in this flow chart
implements certain relevant portions of the illustrative 
decoder of Fig. 5. Specifically, the flow chart shows the 
back-end portion of the decoder for a single one of the 
channels, including the envelope shaping process using 
a single (inverse) prediction filter. The processing which 
is performed by the decoder prior to those steps shown 
in the flow chart of Fig. 8 comprises conventional steps 
performed in prior art decoders (such as those carried 
out by the decoder of Fig. 4) - that is, the bitstream is 
decoded/demultiplexed, the resultant data is decoded 
and inverse quantized, and the intensity stereo decod- 
ing process is performed. Note, however, that bitstream 
decoder/demultiplexer 25 of the illustrative decoder of 
Fig. 5 replaces conventional bitstream decoder/demul- 
tiplexer 21 of the prior art decoder of Fig. 4, so that the 
additional side information provided by the encoder (e.g.,
"L Filter Data" and "R Filter Data") may be advantageously
decoded and provided to inverse predictive filters
26l and 26r.

[0040] After the intensity stereo decoding has been 
completed, the data from the bitstream which signals the 
use of predictive filtering is checked (by decision 72). If 
the data indicates that predictive filtering was performed 
in the encoder (i.e., the "prediction flag" is on), then the 
extended decoding process of steps 73 and 74 is carried 
out. Specifically, the target frequency range of the pre- 
diction filtering, the order of the pre-whitening (predic- 
tion) filter, and information describing the coefficients of 
the filter are retrieved from the (previously decoded) 
side information (step 73). Then, the inverse (decoder) 
prediction filter (i.e., the envelope shaping filter) is ap- 
plied to the range of spectral coefficients matching the 
target frequency range (step 74). In either case (i.e., 
whether predictive filtering was performed or not), the 
decoder processing completes by running the synthesis 
filterbank (for each channel) from the spectral coeffi- 
cients (as processed by the envelope shaping filter, if 
applicable), as shown in step 75. 
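
Decision 72 and steps 73 through 75 can be sketched for a single channel as follows. This is a minimal illustration under stated assumptions: the side information is modelled as a dictionary with hypothetical keys, the synthesis filterbank is passed in as an arbitrary callable, and the inverse filter is assumed to predict from previously reconstructed outputs.

```python
def envelope_shaping_filter(e, a):
    """Inverse prediction filter: out[n] = e[n] + sum(a[j] * out[n - 1 - j])."""
    out = []
    for n in range(len(e)):
        pred = sum(a[j] * out[n - 1 - j] for j in range(len(a)) if n - 1 - j >= 0)
        out.append(e[n] + pred)
    return out


def decode_channel_backend(spectrum, side_info, synthesis_filterbank):
    """Back-end of the decoder for one channel, after intensity stereo decoding."""
    out = list(spectrum)
    if side_info.get("prediction_flag"):          # decision 72
        lo, hi = side_info["range"]               # step 73: retrieve parameters
        a = side_info["coeffs"]
        out[lo:hi] = envelope_shaping_filter(out[lo:hi], a)   # step 74
    return synthesis_filterbank(out)              # step 75, in either case


# Illustrative data only; the identity function stands in for the filterbank.
identity = lambda s: s
side_on = {"prediction_flag": True, "range": (1, 4), "order": 1, "coeffs": [0.5]}
shaped = decode_channel_backend([1.0, 1.0, 1.0, 1.0], side_on, identity)
passthrough = decode_channel_backend(
    [1.0, 1.0, 1.0, 1.0], {"prediction_flag": False}, identity)
```

Only the coefficients inside the signalled target range are altered; everything outside it reaches the synthesis filterbank unchanged.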

Conclusion 

[0041] Using the above-described process in accord- 
ance with the illustrative embodiments of the present in- 
vention (i.e., predictive filtering in the encoder and in- 
verse filtering in the decoder), a straightforward enve- 
lope shaping effect can be achieved for certain conven- 
tional block transforms including the Discrete Fourier 
Transform (DFT) or the Discrete Cosine Transform 
(DCT), both well-known to those of ordinary skill in the 
art. If, for example, a perceptual coder in accordance 
with the present invention uses a critically subsampled 
filterbank with overlapping windows - e.g., a conven- 
tional Modified Discrete Cosine Transform or another 
conventional filterbank based on Time Domain Aliasing 
Cancellation (TDAC) - the resultant envelope shaping 
effect is subject to the time domain aliasing effects inherent
in the filterbank. For example, in the case of an
MDCT, one mirroring (i.e., aliasing) operation per window
half takes place, and the fine envelope structure
appears mirrored (i.e., aliased) within the left and the
right half of the window after decoding, respectively.
Since the final filterbank output is obtained by applying
a synthesis window to the output of each inverse transform
and performing an overlap-add of these data segments,
the undesired aliased components are attenuated
depending on the synthesis window used. Thus, it is
advantageous to choose a filterbank window that exhibits
only a small overlap between subsequent blocks, so
that the temporal aliasing effect is minimized. An appropriate
strategy in the encoder can, for example, adaptively
select a window with a low degree of overlap for
critical signals, thereby providing improved frequency
selectivity. The implementation details of such a strategy
will be obvious to those skilled in the art.

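
The mirroring within each window half can be made concrete with a small Python sketch. It assumes a direct, unwindowed MDCT of one block of 2N samples with a 1/N normalization on the inverse; these conventions and the function names are choices made for illustration, not details taken from the specification.

```python
import math


def mdct(x):
    """Direct O(N^2) MDCT of one block of 2N samples (no window applied)."""
    N = len(x) // 2
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(X):
    """Inverse MDCT of N coefficients, returning 2N time-aliased samples."""
    N = len(X)
    return [sum(X[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N)) / N
            for n in range(2 * N)]


# A single pulse in the left half of an 8-sample block (N = 4).
block = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
aliased = imdct(mdct(block))
# With these conventions the left half is (x[n] - x[N-1-n]) / 2 and the right
# half is (x[n] + x[3N-1-n]) / 2: the pulse reappears mirrored inside its own
# window half.
```

Here the pulse at index 1 returns at half amplitude together with a mirrored, sign-inverted copy at index 2. Overlap-add with the neighbouring windowed block is what cancels such copies, which is why the residual aliasing of a shaped envelope depends on the window overlap.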
[0042] Although a number of specific embodiments of
this invention have been shown and described herein,
it is to be understood that these embodiments are merely
illustrative of the many possible specific arrangements
which can be devised in application of the principles
of the invention. For example, although the illustrative
embodiments which have been shown and described
herein have been limited to the encoding and
decoding of stereophonic audio signals comprising only
two channels, alternative embodiments which may be
used for the encoding and decoding of stereophonic audio
signals having more than two channels will be obvious
to those of ordinary skill in the art based on the disclosure
provided herein. In addition, numerous and varied
other arrangements can be devised in accordance
with these principles by those of ordinary skill in the art
without departing from the scope of the invention.



Claims 

1. A method of performing joint stereo coding of a multichannel
audio signal to generate an encoded signal,
the method comprising the steps of:

(a) performing a spectral decomposition of a
first audio channel signal into a plurality of first
spectral component signals;

(b) generating a first prediction signal representative
of a prediction of one of said first
spectral component signals, said prediction
based on one or more other ones of said first
spectral component signals;

(c) comparing the first prediction signal with
said one of said first spectral component signals
to generate a first prediction error signal;

(d) performing a spectral decomposition of a
second audio channel signal into a plurality of
second spectral component signals;

(e) performing joint stereo coding of said one
of said first spectral component signals and one
of said second spectral component signals to
generate a jointly coded spectral component
signal, said coding based on the first prediction
error signal; and

(f) generating the encoded signal based on the
jointly coded spectral component signal.

2. The method of claim 1 further comprising the steps
of:

(g) generating a second prediction signal representative
of a prediction of said one of said
second spectral component signals, said prediction
based on one or more other ones of said
second spectral component signals; and

(h) comparing the second prediction signal with
said one of said second spectral component
signals to generate a second prediction error
signal;

and wherein the step of performing joint stereo
coding of said one of said first spectral component
signals and said one of said second spectral
component signals is further based on said second
prediction error signal.



3. The method of claim 1 wherein the step of performing
joint stereo coding of said one of said first spectral
component signals and said one of said second
spectral component signals comprises performing
intensity stereo coding of said one of said first spectral
component signals and said one of said second
spectral component signals.

4. The method of claim 1 wherein the step of generating
the encoded signal based on the jointly coded
spectral component signal comprises quantizing
the jointly coded spectral component signal.

5. The method of claim 4 wherein said quantization of
the jointly coded spectral component signal is
based on a perceptual model.

6. A method as claimed in any of the preceding claims
further including the step of storing said encoded
signal on a storage medium.

7. The method of claim 6 wherein the storage medium
comprises a compact disc.

8. The method of claim 6 wherein the storage medium
comprises a digital audio tape.

9. The method of claim 6 wherein the storage medium
comprises a semiconductor memory.

10. A method of decoding an encoded signal to generate
a reconstructed multichannel audio signal, the
encoded signal comprising a joint stereo coding of
an original multichannel audio signal, the method
comprising the steps of:

(a) performing joint stereo decoding of the encoded
signal to generate a plurality of decoded
channel signals, each decoded channel signal
comprising a plurality of decoded spectral component
prediction error signals;

(b) generating a first spectral component signal
based on one or more of said spectral component
prediction error signals comprised in a first
one of said decoded channel signals;

(c) generating a first prediction signal representative
of a prediction of a second spectral
component signal, said prediction based on
said first spectral component signal;

(d) generating the second spectral component
signal based on the first prediction signal and
on one or more of said spectral component prediction
error signals comprised in the first one
of said decoded channel signals; and

(e) generating a first channel of the reconstructed
multi-channel audio signal based on the first
and second spectral component signals.



11. The method of claim 10 further comprising the steps
of:

(f) generating a third spectral component signal 
based on one or more of said spectral compo- 
nent prediction error signals comprised in a 
second one of said decoded channel signals; 

(g) generating a second prediction signal rep- 
resentative of a prediction of a fourth spectral 
component signal, said prediction based on 
said third spectral component signal; 

(h) generating the fourth spectral component 
signal based on the second prediction signal 
and on one or more of said spectral component 
prediction error signals comprised in the sec- 
ond one of said decoded channel signals; and 

(i) generating a second channel of the recon- 
structed multichannel audio signal based on 
the third and fourth spectral component signals. 

12. The method of claim 10 wherein the step of perform- 
ing joint stereo decoding of the encoded signal com- 
prises performing intensity stereo decoding of the 
encoded signal. 

13. An encoder for performing joint stereo coding of a 
multi-channel audio signal to generate an encoded 
signal, the encoder comprising: 

(a) a first filterbank (12l, 12r) which performs a
spectral decomposition of a first audio channel
signal into a plurality of first spectral component
signals;

(b) a first prediction filter (16l, 16r, 47) which
generates a first prediction signal representative
of a prediction of one of said first spectral
component signals, said prediction filter responsive
to one or more other ones of said first
spectral component signals;

(c) a first comparator (48) which compares the
first prediction signal with said one of said first
spectral component signals to generate a first
prediction error signal;

(d) a second filterbank (12l, 12r) which performs
a spectral decomposition of a second audio
channel signal into a plurality of second
spectral component signals;

(e) a joint stereo coder (13) which performs joint
stereo coding of said one of said first spectral
component signals and one of said second
spectral component signals to generate a jointly
coded spectral component signal, said coding
based on the first prediction error signal;
and

(f) a coder (14) which generates the encoded
signal based on the jointly coded spectral component
signal.



14. The encoder of claim 13 further comprising: 

(g) a second prediction filter (16l, 16r, 47) which
generates a second prediction signal representative
of a prediction of said one of said second
spectral component signals, said prediction
based on one or more other ones of said
second spectral component signals; and

(h) a second comparator (48) which compares
the second prediction signal with said one of
said second spectral component signals to
generate a second prediction error signal;

and wherein the joint stereo coder performs
joint stereo coding further based on said second
prediction error signal.

15. The encoder of claim 13 wherein the joint stereo
coder comprises an intensity stereo coder which
performs intensity stereo coding of said one of said
first spectral component signals and said one of
said second spectral component signals.

16. The encoder of claim 13 wherein the coder which 
generates the encoded signal based on the jointly 
coded spectral component signal comprises a 
quantizer which quantizes the jointly coded spectral 
component signal.

17. The encoder of claim 16 wherein the quantizer is 
based on a perceptual model. 



18. A decoder for decoding an encoded signal to gen- 
erate a reconstructed multichannel audio signal, the 
encoded signal comprising a joint stereo coding of 
an original multichannel audio signal, the method 
comprising: 

(a) a joint stereo decoder (23) which performs 
joint stereo decoding of the encoded signal to 
generate a plurality of decoded channel sig- 
nals, each decoded channel signal comprising 
a plurality of decoded spectral component pre- 
diction error signals; 

(b) means for generating a first spectral com- 
ponent signal based on one or more of said 
spectral component prediction error signals 
comprised in a first one of said decoded chan- 
nel signals; 

(c) a first prediction filter (26l, 26r) which generates
a first prediction signal representative of
a prediction of a second spectral component 
signal, said prediction based on said first spec- 
tral component signal; 

(d) means for generating the second spectral 
component signal based on the first prediction 
signal and on one or more of said spectral component
prediction error signals comprised in the
first one of said decoded channel signals; and 

(e) a first filterbank (24l, 24r) which generates
a first channel of the reconstructed multichan- 
nel audio signal based on the first and second 
spectral component signals. 

19. The decoder of claim 18 further comprising:

(f) means for generating a third spectral com- 
ponent signal based on one or more of said 
spectral component prediction error signals 
comprised in a second one of said decoded 
channel signals; 

(g) a second prediction filter which generates a 
second prediction signal representative of a 
prediction of a fourth spectral component sig- 
nal, said prediction based on said third spectral 
component signal; 

(h) means for generating the fourth spectral 
component signal based on the second predic- 
tion signal and on one or more of said spectral 
component prediction error signals comprised 
in the second one of said decoded channel sig- 
nals; and 

(i) a second filterbank which generates a sec- 
ond channel of the reconstructed multi-channel 
audio signal based on the third and fourth spec- 
tral component signals. 

20. The decoder of claim 18 wherein the joint stereo de- 
coder comprises an intensity stereo decoder which 
performs intensity stereo decoding of the encoded
signal.



Patent Claims

1. A method of joint stereo coding of a multichannel audio signal
to generate an encoded signal, the method comprising the
following steps:

(a) performing a spectral decomposition of a first audio channel
signal into a plurality of first spectral component signals;

(b) generating a first prediction signal which represents a
prediction of one of the first spectral component signals, the
prediction being based on one or more other ones of the first
spectral component signals;

(c) comparing the first prediction signal with said one of the
first spectral component signals to generate a first prediction
error signal;

(d) performing a spectral decomposition of a second audio channel
signal into a plurality of second spectral component signals;

(e) joint stereo coding of said one of the first spectral
component signals and one of the second spectral component
signals to generate a jointly coded spectral component signal,
the coding being based on the first prediction error signal; and

(f) generating the encoded signal on the basis of the jointly
coded spectral component signal.

2. The method of claim 1, further comprising the following steps:

(g) generating a second prediction signal which represents a
prediction of said one of the second spectral component signals,
the prediction being based on one or more other ones of the
second spectral component signals; and

(h) comparing the second prediction signal with said one of the
second spectral component signals to generate a second prediction
error signal;

and wherein the step of joint stereo coding of said one of the
first spectral component signals and said one of the second
spectral component signals is further based on the second
prediction error signal.

3. The method of claim 1, wherein the step of joint stereo coding
of said one of the first spectral component signals and said one
of the second spectral component signals comprises intensity
stereo coding of said one of the first spectral component signals
and said one of the second spectral component signals.

4. The method of claim 1, wherein the step of generating the
encoded signal on the basis of the jointly coded spectral
component signal comprises quantizing the jointly coded spectral
component signal.

5. The method of claim 4, wherein the quantization of the jointly
coded spectral component signal is based on a perceptual model.

6. The method of any of the preceding claims, further comprising
the step of storing the encoded signal on a storage medium.

7. The method of claim 6, wherein the storage medium comprises a
compact disc.

8. The method of claim 6, wherein the storage medium comprises a
digital audio tape.

9. The method of claim 6, wherein the storage medium comprises a
semiconductor memory.

10. A method of decoding an encoded signal to generate a
reconstructed multichannel audio signal, the encoded signal
comprising a joint stereo coding of an original multichannel
audio signal, the method comprising the following steps:

(a) joint stereo decoding of the encoded signal to generate a
plurality of decoded channel signals, each decoded channel signal
comprising a plurality of decoded spectral component prediction
error signals;

(b) generating a first spectral component signal on the basis of
one or more of said spectral component prediction error signals
comprised in a first one of the decoded channel signals;

(c) generating a first prediction signal which represents a
prediction of a second spectral component signal, the prediction
being based on the first spectral component signal;

(d) generating the second spectral component signal on the basis
of the first prediction signal and of one or more of the spectral
component prediction error signals comprised in the first one of
the decoded channel signals; and

(e) generating a first channel of the reconstructed multichannel
audio signal on the basis of the first and second spectral
component signals.

11. The method of claim 10, further comprising the following
steps:

(f) generating a third spectral component signal on the basis of
one or more of the spectral component prediction error signals
comprised in a second one of the decoded channel signals;

(g) generating a second prediction signal which represents a
prediction of a fourth spectral component signal, the prediction
being based on the third spectral component signal;

(h) generating the fourth spectral component signal on the basis
of the second prediction signal and of one or more of the
spectral component prediction error signals comprised in the
second one of the decoded channel signals; and

(i) generating a second channel of the reconstructed multichannel
audio signal on the basis of the third and fourth spectral
component signals.

12. The method of claim 10, wherein the step of joint stereo
decoding of the encoded signal comprises intensity stereo
decoding of the encoded signal.

13. An encoder for joint stereo coding of a multichannel audio
signal to generate an encoded signal, the encoder comprising:

(a) a first filterbank (12l, 12r) which performs a spectral
decomposition of a first audio channel signal into a plurality of
first spectral component signals;

(b) a first prediction filter (16l, 16r, 47) which generates a
first prediction signal representing a prediction of one of the
first spectral component signals, the prediction filter being
responsive to one or more other ones of the first spectral
component signals;

(c) a first comparator (48) which compares the first prediction
signal with said one of the first spectral component signals to
generate a first prediction error signal;

(d) a second filterbank (12l, 12r) which performs a spectral
decomposition of a second audio channel signal into a plurality
of second spectral component signals;

(e) a joint stereo coder (13) which performs joint stereo coding
of said one of the first spectral component signals and one of
the second spectral component signals to generate a jointly coded
spectral component signal, the coding being based on the first
prediction error signal; and

(f) a coder (14) which generates the encoded signal on the basis
of the jointly coded spectral component signal.

14. The encoder of claim 13, further comprising:

(g) a second prediction filter (16l, 16r, 47) which generates a
second prediction signal representing a prediction of said one of
the second spectral component signals, the prediction being based
on one or more other ones of the second spectral component
signals; and

(h) a second comparator (48) which compares the second prediction
signal with said one of the second spectral component signals to
generate a second prediction error signal;

and wherein the joint stereo coder performs the joint stereo
coding further on the basis of the second prediction error
signal.

15. The encoder of claim 13, wherein the joint stereo coder
comprises an intensity stereo coder which performs intensity
stereo coding of said one of the first spectral component signals
and said one of the second spectral component signals.

16. The encoder of claim 13, wherein the coder which generates
the encoded signal on the basis of the jointly coded spectral
component signal comprises a quantizer which quantizes the
jointly coded spectral component signal.

17. The encoder of claim 16, wherein the quantizer is based on a
perceptual model.

18. A decoder for decoding an encoded signal to generate a
reconstructed multichannel audio signal, the encoded signal
comprising a joint stereo coding of an original multichannel
audio signal, the method comprising:

(a) a joint stereo decoder (23) which performs joint stereo
decoding of the encoded signals to generate a plurality of
decoded channel signals, each decoded channel signal comprising a
plurality of decoded spectral component prediction error signals;

(b) means for generating a first spectral component signal on the
basis of one or more of the spectral component prediction error
signals comprised in a first one of the decoded channel signals;

(c) a first prediction filter (26l, 26r) which generates a first
prediction signal representing a prediction of a second spectral
component signal, the prediction being based on the first
spectral component signal;

(d) means for generating the second spectral component signal on
the basis of the first prediction signal and of one or more of
the spectral component prediction error signals comprised in the
first one of the decoded channel signals; and

(e) a first filterbank (24l, 24r) which generates a first channel
of the reconstructed multichannel audio signal on the basis of
the first and second spectral component signals.

19. The decoder of claim 18, further comprising:

(f) means for generating a third spectral component
signal based on one or more of the spectral
component prediction error signals comprised in a
second one of the decoded channel signals;

(g) a second prediction filter, which generates a
second prediction signal representing a prediction
of a fourth spectral component signal, the
prediction being based on the third spectral
component signal;

(h) means for generating the fourth spectral
component signal based on the second prediction
signal and one or more of the spectral component
prediction error signals comprised in the second
one of the decoded channel signals; and

(i) a second filterbank, which generates a second
channel of the reconstructed multi-channel audio
signal based on the third and fourth spectral
component signals.

20. The decoder of claim 18, wherein the joint stereo
decoder comprises an intensity stereo decoder which
performs intensity stereo decoding of the coded signal.



Claims

1. A method of performing joint stereo coding of a
multi-channel audio signal to generate a coded signal,
the method comprising the steps of:

(a) performing a spectral decomposition of a first
audio channel signal into a plurality of first
spectral component signals;

(b) generating a first prediction signal
representative of a prediction of one of said first
spectral component signals, said prediction being
based on one or more of said first spectral
component signals;

(c) comparing the first prediction signal with said
one of said first spectral component signals to
generate a first prediction error signal;

(d) performing a spectral decomposition of a second
audio channel signal into a plurality of second
spectral component signals;

(e) performing joint stereo coding of said one of
said first spectral component signals and one of
said second spectral component signals to generate
a jointly coded spectral component signal, said
coding being based on the first prediction error
signal; and

(f) generating the coded signal based on the
jointly coded spectral component signal.
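
By way of illustration only, steps (b) and (c) can be sketched as an open-loop linear prediction running across the frequency index of one channel. The function name `prediction_error` and the predictor coefficients passed to it are assumptions made for this sketch; the claim does not prescribe how the coefficients are obtained.

```python
def prediction_error(spectrum, coeffs):
    """Open-loop linear prediction across frequency (illustrative sketch).

    spectrum: spectral component values of one channel, ordered by frequency.
    coeffs:   hypothetical predictor coefficients a[1..p], supplied by the caller.
    Returns e[k] = x[k] - sum_i a[i] * x[k - i], i.e. each component minus
    its prediction from the components just below it in frequency.
    """
    order = len(coeffs)
    errors = []
    for k, x in enumerate(spectrum):
        # Step (b): predict component k from the preceding components.
        predicted = sum(
            coeffs[i] * spectrum[k - 1 - i]
            for i in range(order)
            if k - 1 - i >= 0
        )
        # Step (c): compare prediction and actual component.
        errors.append(x - predicted)
    return errors


if __name__ == "__main__":
    print(prediction_error([1.0, 2.0, 4.0, 8.0], [2.0]))
```

In this toy case every component is exactly twice its lower neighbour, so a first-order predictor with coefficient 2 leaves an error only at the first component.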

2. The method of claim 1, further comprising the steps
of:

(g) generating a second prediction signal
representative of a prediction of said one of said
second spectral component signals, said prediction
being based on one or more others of said second
spectral component signals; and

(h) comparing the second prediction signal with
said one of said second spectral component signals
to generate a second prediction error signal;

and wherein the step of performing joint stereo coding
of said one of said first spectral component signals
and said one of said second spectral component signals
is further based on said second prediction error
signal.

3. The method of claim 1, wherein the step of
performing joint stereo coding of said one of said
first spectral component signals and said one of said
second spectral component signals comprises performing
intensity stereo coding of said one of said first
spectral component signals and said one of said second
spectral component signals.
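
As a hedged illustration of intensity stereo coding applied to one band of the two channels' signals, the sketch below transmits a single carrier signal plus one scale factor per channel. The energy-preserving scaling rule and all names (`intensity_encode`, `intensity_decode`) are assumptions of this sketch, one common formulation of intensity stereo rather than the method fixed by the claims.

```python
import math


def intensity_encode(left, right):
    """Encode one band as a shared carrier plus per-channel scale factors.

    left, right: equally long lists of spectral values for the band.
    Returns (carrier, scale_left, scale_right), where each scale factor
    maps the carrier energy back to the energy of the original channel.
    """
    carrier = [l + r for l, r in zip(left, right)]
    carrier_energy = sum(c * c for c in carrier)
    if carrier_energy == 0.0:
        # Silent (or perfectly cancelling) band: nothing to scale.
        return carrier, 0.0, 0.0
    left_energy = sum(l * l for l in left)
    right_energy = sum(r * r for r in right)
    return (
        carrier,
        math.sqrt(left_energy / carrier_energy),
        math.sqrt(right_energy / carrier_energy),
    )


def intensity_decode(carrier, scale_left, scale_right):
    """Rebuild both channels of the band from the shared carrier."""
    return (
        [scale_left * c for c in carrier],
        [scale_right * c for c in carrier],
    )


if __name__ == "__main__":
    carrier, s_l, s_r = intensity_encode([1.0, 2.0], [2.0, 4.0])
    print(intensity_decode(carrier, s_l, s_r))
```

Both decoded channels share the fine structure of the carrier and differ only in level, which is why the channels must have similar structure in the band for this coding to be transparent.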

4. The method of claim 1, wherein the step of
generating the coded signal based on the jointly coded
spectral component signal comprises quantizing the
jointly coded spectral component signal.

5. The method of claim 4, wherein said quantization of
the jointly coded spectral component signal is based
on a perceptual model.
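
One way to picture quantization "based on a perceptual model" is a uniform quantizer whose step size is derived from a masking threshold for the band. The sketch below assumes the usual approximation that a uniform quantizer with step size d adds noise of power d*d/12, and chooses d so that this noise power equals the threshold. The function names and the idea of passing the threshold as a single number per band are assumptions for illustration only.

```python
import math


def step_from_threshold(masking_threshold):
    """Step size whose quantization noise power (step**2 / 12) equals
    the permitted noise power given by the perceptual model."""
    if masking_threshold <= 0.0:
        raise ValueError("masking threshold must be positive")
    return math.sqrt(12.0 * masking_threshold)


def quantize(values, step):
    """Map each spectral value to the index of the nearest step multiple."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct spectral values from quantizer indices."""
    return [i * step for i in indices]


if __name__ == "__main__":
    step = step_from_threshold(0.75)
    indices = quantize([3.0, -6.2, 1.0], step)
    print(step, indices, dequantize(indices, step))
```

A higher masking threshold yields a coarser step and therefore fewer bits, while the added noise stays at the level the model declares inaudible.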

6. The method of any one of the preceding claims,
further comprising the step of storing said coded
signal on a storage medium.

7. The method of claim 6, wherein the storage medium
comprises a compact disc.

8. The method of claim 6, wherein the storage medium
comprises a digital audio tape.

9. The method of claim 6, wherein the storage medium
comprises a semiconductor memory.

10. A method of decoding a coded signal to generate a
reconstructed multi-channel audio signal, the coded
signal comprising a joint stereo coding of an original
multi-channel audio signal, the method comprising the
steps of:

(a) performing joint stereo decoding of the coded
signal to generate a plurality of decoded channel
signals, each decoded channel signal comprising a
plurality of decoded spectral component prediction
error signals;

(b) generating a first spectral component signal
based on one or more of said spectral component
prediction error signals comprised in a first one
of said decoded channel signals;

(c) generating a first prediction signal
representative of a prediction of a second spectral
component signal, said prediction being based on
said first spectral component signal;

(d) generating the second spectral component signal
based on the first prediction signal and one or
more of said spectral component prediction error
signals comprised in the first one of said decoded
channel signals; and

(e) generating a first channel of the reconstructed
multi-channel audio signal based on the first and
second spectral component signals.
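
Steps (b) to (d) amount to running the prediction in reverse: each spectral component is rebuilt from its prediction error plus a prediction formed from components already rebuilt. The following is a minimal sketch under the assumption of a linear predictor with known coefficients; `reconstruct_spectrum` and its parameters are hypothetical names, and the synthesis filterbank of step (e) is not shown.

```python
def reconstruct_spectrum(errors, coeffs):
    """Rebuild spectral components from prediction errors (illustrative).

    errors: decoded prediction error values, ordered by frequency.
    coeffs: hypothetical predictor coefficients a[1..p], assumed to match
            those used when the errors were formed.
    Computes x[k] = e[k] + sum_i a[i] * x[k - i].
    """
    order = len(coeffs)
    spectrum = []
    for k, e in enumerate(errors):
        # Step (c): predict component k from components already rebuilt.
        predicted = sum(
            coeffs[i] * spectrum[k - 1 - i]
            for i in range(order)
            if k - 1 - i >= 0
        )
        # Steps (b) and (d): add the prediction error to the prediction.
        spectrum.append(e + predicted)
    return spectrum


if __name__ == "__main__":
    print(reconstruct_spectrum([1.0, 0.0, 0.0, 0.0], [2.0]))
```

The lowest component has no predecessors, so it is taken directly from its prediction error, which corresponds to step (b); every later component follows steps (c) and (d).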

11. The method of claim 10, comprising the following
steps:

(f) generating a third spectral component signal
based on one or more of said spectral component
prediction error signals comprised in a second one
of said decoded channel signals;

(g) generating a second prediction signal
representative of a prediction of a fourth spectral
component signal, said prediction being based on
said third spectral component signal;

(h) generating the fourth spectral component signal
based on the second prediction signal and one or
more of said spectral component prediction error
signals comprised in the second one of said decoded
channel signals; and

(i) generating a second channel of the
reconstructed multi-channel audio signal based on
the third and fourth spectral component signals.

12. The method of claim 10, wherein the step of
performing joint stereo decoding of the coded signal
comprises performing intensity stereo decoding of the
coded signal.

13. A coder for performing joint stereo coding of a
multi-channel audio signal to generate a coded signal,
the coder comprising:

(a) a first filterbank (12L, 12R), which performs a
spectral decomposition of a first audio channel
signal into a plurality of first spectral component
signals;

(b) a first prediction filter (16L, 16R, 47), which
generates a first prediction signal representative
of a prediction of one of said first spectral
component signals, said prediction filter being
responsive to one or more others of said first
spectral component signals;

(c) a first comparator (48), which compares the
first prediction signal with said one of said first
spectral component signals to generate a first
prediction error signal;

(d) a second filterbank (12L, 12R), which performs
a spectral decomposition of a second audio channel
signal into a plurality of second spectral
component signals;

(e) a joint stereo coder (13), which performs joint
stereo coding of said one of said first spectral
component signals and one of said second spectral
component signals to generate a jointly coded
spectral component signal, said coding being based
on the first prediction error signal; and

(f) a coder (14), which generates the coded signal
based on the jointly coded spectral component
signal.

14. The coder of claim 13, further comprising:

(g) a second prediction filter (16L, 16R, 47),
which generates a second prediction signal
representative of a prediction of said one of said
second spectral component signals, said prediction
being based on one or more others of said second
spectral component signals; and

(h) a second comparator (48), which compares the
second prediction signal with said one of said
second spectral component signals to generate a
second prediction error signal;

and wherein the joint stereo coder performs joint
stereo coding further based on said second prediction
error signal.

15. The coder of claim 13, wherein the joint stereo
coder comprises an intensity stereo coder, which
performs intensity stereo coding of said one of said
first spectral component signals and said one of said
second spectral component signals.

16. The coder of claim 13, wherein the coder which
generates the coded signal based on the jointly coded
spectral component signal comprises a quantizer, which
quantizes the jointly coded spectral component signal.

17. The coder of claim 16, wherein the quantizer is
based on a perceptual model.

18. A decoder for decoding a coded signal to generate
a reconstructed multi-channel audio signal, the coded
signal comprising a joint stereo coding of an original
multi-channel audio signal, the decoder comprising:

(a) a joint stereo decoder (23), which performs
joint stereo decoding of the coded signal to
generate a plurality of decoded channel signals,
each decoded channel signal comprising a plurality
of decoded spectral component prediction error
signals;

(b) means for generating a first spectral component
signal based on one or more of said spectral
component prediction error signals in a first one
of said decoded channel signals;

(c) a first prediction filter (26L, 26R), which
generates a first prediction signal representative
of a prediction of a second spectral component
signal, said prediction being based on said first
spectral component signal;

(d) means for generating the second spectral
component signal based on the first prediction
signal and one or more of said spectral component
prediction error signals comprised in the first one
of said decoded channel signals; and

(e) a first filterbank (24L, 24R), which generates
a first channel of the reconstructed multi-channel
audio signal based on the first and second spectral
component signals.

19. The decoder of claim 18, further comprising:

(f) means for generating a third spectral component
signal based on one or more spectral component
prediction error signals comprised in a second one
of said decoded channel signals;

(g) a second prediction filter, which generates a
second prediction signal representative of a
prediction of a fourth spectral component signal,
said prediction being based on said third spectral
component signal;

(h) means for generating the fourth spectral
component signal based on the second prediction
signal and one or more of the spectral component
prediction error signals comprised in said second
one of said decoded channel signals; and

(i) a second filterbank, which generates a second
channel of the reconstructed multi-channel audio
signal based on the third and fourth spectral
component signals.

20. The decoder of claim 18, wherein the joint stereo
decoder comprises an intensity stereo decoder, which
performs intensity stereo decoding of the coded signal.

[Figure, encoder flowchart (fragment). Box labels: FILTER COEFFICIENTS; TRANSMIT SIDE INFORMATION: "PREDICTION FLAG" OFF; TO INTENSITY STEREO ENCODING]

[Figure, decoder flowchart. Box labels in order: FROM INTENSITY STEREO DECODING; PREDICTION FLAG ON? (YES / NO); DECODE SIDE INFORMATION: TARGET FREQUENCY RANGE, ORDER OF PREDICTION FILTER, FILTER COEFFICIENTS; APPLY INVERSE FILTER TO SPECTRAL COEFFICIENTS; CALC. OUTPUT TIME SIGNAL (RUN SYNTHESIS FILTERBANK)]